Abstract: LT codes and digital fountain techniques have received
significant attention from both academia and industry
in the past few years. There has also been extensive interest
in applying LT code techniques to distributed storage systems
such as cloud data storage in recent years. However, Plank and
Thomason's experimental results show that LDPC codes perform
well only asymptotically as the number of data fragments
increases and have the worst performance for small numbers
of data fragments (e.g., fewer than 100). In their INFOCOM
2012 paper, Cao, Yu, Yang, Lou, and Hou proposed to use an
exhaustive search approach to find a deterministic LT code that
could be used to decode the original data content correctly in
distributed storage systems. However, in light of Plank and
Thomason's experimental results, it is not clear whether the
exhaustive search approach will work efficiently or even correctly. This
paper carries out the theoretical analysis on the feasibility and 
performance issues for applying LT codes to distributed storage 
systems. By employing the underlying ideas of efficient Belief 
Propagation (BP) decoding process in LT codes, this paper 
introduces two classes of codes called flat BP-XOR codes and 
array BP-XOR codes (which can be considered as a deterministic 
version of LT codes). We will show the equivalence between 
the edge-colored graph model and degree-one-and-two encoding 
symbols based array BP-XOR codes. Using this equivalence 
result, we are able to design general array BP-XOR codes using 
graph based results. Similarly, based on this equivalence result, 
we are able to get new results for edge-colored graph models 
using results from array BP-XOR codes. 

I. Introduction 

Due to the advancement of cloud computing technologies, 
there has been an increased interest for individuals and busi- 
ness entities to move their data from traditional private data 
center to cloud servers. Indeed, even popular storage service 
providers such as Dropbox use third party cloud storage 
providers such as Amazon's Simple Storage Service (S3) for 
data storage. 

With the wide adoption of cloud computing and storage 
technologies, it is important to consider data security and 
reliability issues that are strongly related to the underlying 
storage services. Though it is interesting to consider general 
data security issues in cloud computing environments, this 
paper will concentrate on the basic question of reliable data 
storage in the cloud and specifically the coding techniques for 
data storage in the cloud. There has been extensive research 
in reliable data storage on disk drives. For example, redundant 
array of independent disks (RAID) techniques have been 
proposed and widely adopted to combine multiple disk drive
components into a logical unit for better resilience, performance,
and capacity. The well known solution for data storage
reliability is to add data redundancy across multiple
drives. There are basically two ways to add the redundancy:
data mirroring (e.g., RAID 1) and data striping with erasure
codes (e.g., RAID 2 to RAID 6). Though data mirroring (or data
replication) provides the straightforward way for simple data 
management and repair of data on corrupted drives, it is very 
expensive to implement and deploy due to its high demand for 
redundancy. In addition to data replication techniques, erasure 
codes can be used to achieve the required data reliability 
level with much less data redundancy. Note that though error
correcting codes (e.g., Reed-Solomon codes) could also be
used for reliable data storage and for correcting errors from failed
disk drives, they are normally not used for data storage since they
need expensive computation for both the encoding and decoding
processes.

Erasure codes that have been used for reliable data storage 
systems are mainly binary linear codes which are essentially 
XOR-operation based codes. For example, flat XOR codes 
are erasure codes in which parity disks are calculated as the 
XOR of some subset of data disks. Though it is desirable
to have MDS (maximum distance separable) flat XOR codes,
they may not be available for all scenarios. Non-MDS codes
have also been used in storage systems (e.g., the replicated 
RAID configurations such as RAID 10, RAID 50, and RAID 
60). However, we have not seen any systematic research 
in designing non-MDS codes with flat XOR operations for 
storage systems. 

In order to achieve better fault tolerance with minimal 
redundancy in data storage systems, there has also been active 
research in XOR based codes which are not necessarily flat 
XOR codes. For example, Blaum, Brady, Bruck, and Menon 
[3] proposed the array code EVENODD for tolerating two
disk faults and correcting one disk error. Blaum, Bruck, and
Vardy [4] and Huang [17] have extended the construction of
EVENODD code to general codes for tolerating three disk 
faults. Other non-flat XOR based codes include (but are not 
limited to) [2k, k, d] chain code, Simple Product Code (SPC 
HI), Row-Diagonal Parity (RDP [9]), and X-code [30].

The techniques discussed above were originally designed
for data storage on disk drives and in storage area networks.
They may not be directly applicable to distributed storage
services such as the storage cloud. There has been much
research addressing data storage reliability issues in
distributed environments. For example,
Weatherspoon and Kubiatowicz [28] have compared erasure
coding based solutions and replication based solutions for
reliable distributed data storage systems.

Based on the seminal work of low-density parity-check 
(LDPC) codes by Gallager [15], several important techniques 
(see, e.g., Luby, Mitzenmacher, Shokrollahi, Spielman, and
Stemann [21]) have been developed for networked and
communication systems such as Digital Fountain for content
streaming. There has also been extensive interest in applying
digital fountain techniques (such as LT codes) to distributed
storage systems (see, e.g., Plank and Thomason [23] and
Cao, Yu, Yang, Lou, and Hou [7]). However, it is not clear
whether these applications of LT codes to distributed storage 
systems have advantages over other techniques that have been 
extensively adopted in disk drives and storage area networks 
such as flat XOR codes or array XOR codes. 

Plank and Thomason's experimental results [23] show that
LDPC codes perform well only asymptotically as the
number of data fragments increases, and that they have the worst
performance for small numbers of data fragments (e.g., fewer
than 100). Cao, Yu, Yang, Lou, and Hou [7] proposed to
use an exhaustive search approach to find a deterministic LT
code that could be used to decode the original data content
correctly in distributed storage systems. However, in light of Plank
and Thomason's experimental results, it is not clear whether
the exhaustive search approach will work efficiently or even
correctly. In this paper, we carry out a theoretical analysis of
the feasibility and performance issues in applying LT codes
to distributed storage systems. By employing the underlying
ideas of the efficient Belief Propagation (BP) decoding process in
LT codes ([20]), we introduce two classes of codes called flat
BP-XOR codes and array BP-XOR codes. The BP-XOR codes
and array BP-XOR codes can be considered as a deterministic
version of LT codes, though flat BP-XOR codes are different
from LT codes. Edge-colored graph models were introduced
by Wang and Desmedt in [27] to model homogeneous faults
in networks. We will show the equivalence between the
edge-colored graph model and degree-one-and-two encoding
symbol based array BP-XOR codes. Using this equivalence
result, we are able to design general array BP-XOR codes
using graph based results. Similarly, based on this equivalence
result, we are able to get new results for edge-colored graph
models using results from array BP-XOR codes. We have
implemented an online software package for users to generate
array BP-XOR codes with their own specifications and to verify
the validity of their array BP-XOR codes (see [26]).

The structure of this paper is as follows. In Section II
we briefly review several coding techniques proposed for
distributed storage systems. Section III presents the challenges
in applying LT codes to distributed storage systems. Section
IV introduces flat BP-XOR codes for distributed storage
systems and investigates the necessary and sufficient bounds
for the existence of the codes. Section V introduces array
BP-XOR codes for distributed storage systems and establishes
the equivalence between edge-colored graph models and array
BP-XOR codes with degree one and two encoding symbols.
Section VI presents constructions of array BP-XOR codes
from graph based results (e.g., perfect one factorization of
complete graphs), and Section VII proves a theorem on the
limitation of array BP-XOR codes using only degree one and
two encoding symbols.

II. Coding Techniques for Distributed Storage 
Systems 



In the seminal paper [24], Rabin proposed the Information
Dispersal Algorithm (IDA) to code a data file into n pieces that
will be stored among n servers such that the recovery of the
information is possible when there are at most t = n - k failed
servers (inactive but not malicious Byzantine style servers).
Rabin's scheme is essentially a kind of Reed-Solomon code
and needs relatively expensive finite field operations for
encoding and decoding.

Krawczyk [19] extended Rabin's IDA scheme to address
Byzantine style malicious servers which may intentionally
modify their pieces of the information. Krawczyk called his
scheme the Secure Information Dispersal Algorithm (SIDA).

There has been extensive interest in applying random
linear coding techniques to distributed storage systems. For
example, Dimakis, Ramchandran, Wu, and Suh ifTTI and Di- 
makis, Godfrey, Wu, Wainwright, and Ramchandran 1 1 1 used 
information flow graphs and random linear coding to achieve 
information theoretic minimum functional repair bandwidth
$\gamma_{\min} = \frac{2|F|d}{2kd - k^2 + k}$. In other words, if we divide the file
F into k pieces, store the encoded fragments in n storage
servers, and one of these servers fails, then a newcomer
storage server could functionally repair the system only if it
could download $\gamma_{\min}/d$ bits from each of the d surviving
storage servers.
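
The repair bandwidth expression can be evaluated directly. The following is a minimal sketch, assuming the standard minimum-bandwidth form $\gamma_{\min} = 2|F|d/(2kd - k^2 + k)$; the function name `min_repair_bandwidth` and the numeric parameters are hypothetical and chosen only for illustration.

```python
from fractions import Fraction


def min_repair_bandwidth(file_size, k, d):
    """Return gamma_min = 2*|F|*d / (2*k*d - k^2 + k) as an exact fraction.

    file_size is |F| in bits, k is the number of pieces, and d is the
    number of surviving servers contacted by the newcomer (d >= k).
    """
    if not 1 <= k <= d:
        raise ValueError("expected 1 <= k <= d")
    return Fraction(2 * file_size * d, 2 * k * d - k * k + k)


# Hypothetical parameters: a 1200-bit file, k = 3 pieces, d = 4 helpers.
gamma = min_repair_bandwidth(1200, 3, 4)
per_helper = gamma / 4          # bits downloaded from each surviving server
print(gamma, per_helper)        # 1600/3 400/3
```

With d = k = 1 the expression reduces to |F|, i.e. the newcomer simply downloads the whole file from the single surviving server.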

Inspired by network coding, Acedanski, Medard, and Koetter
[1] proposed to use random linear coding for distributed
networked storage with one centralized server and multiple
storage servers. Dimakis, Prabhakaran, and Ramchandran [12]
considered the problem from a different approach: there are
n storage servers and many distributed data sources (that is,
data are not from a central location and there is no centralized
server). Each data source node picks one out of the n storage
servers randomly, pre-routes its packet, and repeats this
$d(k) = c \ln k$ times. Each storage server multiplies what it receives
with coefficients selected uniformly and independently from
$F_q$ and stores the results together with the coefficients.

III. Challenges in Applying LT codes to
distributed storage systems

Luby [20] pointed out that one of the potential applications
of LT codes is distributed data storage systems. Several
authors have continued these ideas with fruitful outcomes.
For example, Plank and Thomason [23] have considered the
practical implementations of LDPC codes for peer-to-peer and
distributed storage systems and several experimental results
have been reported. In particular, the experimental results in
[23] show that LDPC "codes display their worst performance
for 10 < n < 100" where n is the number of data fragments.
Furthermore, their experiments show that "generating good
instances of the codes is a black art ... there is an opportunity
for theoretical research on codes for small n to have a very
wide-reaching impact". Recently, Cao, Yu, Yang, Lou, and
Hou [7] proposed an LT code based secure cloud storage
service (LTCS). In the LTCS scheme, a data file F is split
into A packets, each of which is |F|/A bits. The LT coding
process is then used to generate $n\alpha$ encoded packets, where
$\alpha = A/k \cdot (1 + \varepsilon)$. These encoded packets are divided into n
groups and each of the n storage servers receives $\alpha$ packets.
One of the basic requirements from the paper [7] is that the
original A data packets should always be recoverable from any
k healthy servers. For LT codes, there is a small probability
that one may not be able to reconstruct the original data
packets from the k servers. In order to address this challenge,
the authors in [7] recommended an exhaustive search method
that divides the encoded symbols into n groups and checks the
decodability for each combination of k groups. The process
continues until one finds a valid LT coding approach.
This approach is not efficient when k and n are relatively
large. Furthermore, there is no guarantee that the exhaustive
search method will end with a valid LT code. The authors did
not give any analysis of how efficient this approach could
be or any proof that it is feasible. For the case of
$\alpha = 1$, the coding scheme in [7] is essentially a flat XOR
code that requires a belief propagation (BP) decoder ([20]).
The following example shows that the exhaustive search may
not succeed for some cases with $\alpha = 1$.

Example 3.1: For n = 5 and k = 3, the original data is
divided into three fragments $v_1, v_2, v_3$ and coding symbols
are stored in 5 servers $(S_1, \ldots, S_5)$ such that the original
data could be recovered from any 3 servers. In order for
the belief propagation (BP) decoder to work, we must start
from an original copy of $v_1$ or $v_2$ or $v_3$. Since there could be
two erased (faulty) servers, three servers have to store original
copies of the data fragments. Without loss of generality,
assume that $S_1$, $S_2$, and $S_3$ store $v_1$, $v_2$, and $v_3$ respectively.
Again, since there could be two erased servers, each data
fragment needs to be stored in at least three servers. Thus
both $S_4$ and $S_5$ need to store $v_1 \oplus v_2 \oplus v_3$. Now if both $S_1$
and $S_2$ are faulty, neither $v_1$ nor $v_2$ could be recovered. □
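
The conclusion of Example 3.1 can also be checked by brute force. The sketch below is an illustration only, not the search procedure of [7]: it represents each stored symbol as a bitmask of the fragments XORed into it, runs a peeling style BP decoder over the erasure channel, and enumerates every way of assigning nonzero symbols to the servers. The names `bp_decodes` and `tolerant_codes` are hypothetical.

```python
from itertools import combinations, combinations_with_replacement


def bp_decodes(masks, k):
    """Peeling (BP) decoder over the erasure channel.

    Each mask is a bitmask of the fragments XORed into one stored symbol.
    Returns True if all k fragments can be recovered.
    """
    known = 0
    progress = True
    while progress:
        progress = False
        for m in masks:
            rest = m & ~known                          # still-unknown fragments
            if rest and (rest & (rest - 1)) == 0:      # exactly one unknown
                known |= rest
                progress = True
    return known == (1 << k) - 1


def tolerant_codes(n, k):
    """All multisets of n nonzero symbols that BP-decode from every k servers."""
    found = []
    for code in combinations_with_replacement(range(1, 1 << k), n):
        if all(bp_decodes(sub, k) for sub in combinations(code, k)):
            found.append(code)
    return found


print(len(tolerant_codes(5, 3)))            # 0: no assignment works for n = 5, k = 3
print((1, 2, 4, 7) in tolerant_codes(4, 3)) # True: three fragments plus one parity
```

The search space here is tiny (462 multisets for n = 5, k = 3), which is why exhaustive enumeration is feasible in this toy case.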

For the case of $\alpha > 1$, the coding scheme in [7] is a kind of
array XOR code (not a flat XOR code). In this case, the robust
soliton distribution will be used to generate the $n\alpha$ encoding
symbols. In order for the analysis and bounds of the LT code
to work, the numbers n, $\alpha$, and A in [7] have to be sufficiently
large (this has been confirmed by the experiments in [23]). For
smaller values, these bounds may not work and the exhaustive
search methods in [7] may never end with a successful code.
However, for large enough n, $\alpha$, and A, the exhaustive search
method will be inefficient and may be infeasible.



The experimental results from [23] and the potential
challenges in the scheme [7] show that it is necessary and
important to systematically study the encoding symbol generation
problems for applying LT codes to distributed storage systems.
In the following sections, we will show what we could achieve 
and what we could not achieve with LT codes when applied 
to distributed storage systems. 



IV. Flat BP-XOR codes 

In this section, we introduce a class of codes called flat 
BP-XOR codes. In short, flat BP-XOR codes are flat XOR 
codes that could be decoded with the Belief Propagation (BP) 
algorithm for erasure codes. The BP algorithm for binary 
symmetric channels is presented in Gallager [15] and is also used
in artificial intelligence community El . In our paper, we use 
the BP algorithm for binary erasure channels (see [21], [20]).

Let $M = \{0, 1\}^l$ be the message symbol set. The length
$l$ could be any number and it does not have an impact on the
coding. An [n, k, d] flat BP-XOR code is a binary linear code
determined by a $k \times n$ zero-one valued generator matrix G such
that for a given message vector $x \in M^k$, the corresponding
codeword $y \in M^n$ is computed as $y = xG$, where the addition of
two strings in M is defined as the bitwise XOR. Furthermore,
a flat [n, k, d] BP-XOR code requires that if at most $d - 1$
components in y are missing, then x could be recovered from
the remaining components of y with the Belief Propagation
algorithm.
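
To make the definition concrete, here is a minimal sketch of encoding with y = xG and BP decoding over the erasure channel. It assumes fragments are equal-length byte strings and that erased components are marked with `None`; the helper names and the sample [4, 3, 2] generator matrix (three fragments plus one XOR parity) are chosen for illustration.

```python
def xor_bytes(a, b):
    """Bitwise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def encode(fragments, G):
    """Compute y = xG: component j is the XOR of fragments i with G[i][j] == 1."""
    length = len(fragments[0])
    symbols = []
    for j in range(len(G[0])):
        sym = bytes(length)
        for i, frag in enumerate(fragments):
            if G[i][j]:
                sym = xor_bytes(sym, frag)
        symbols.append(sym)
    return symbols


def bp_decode(symbols, G, k):
    """BP decoding; symbols[j] is None if component j is erased.

    Returns the list of k fragments, or None if BP gets stuck.
    """
    # Each entry: [set of still-unknown fragment indices, running XOR value].
    pending = [[{i for i in range(k) if G[i][j]}, s]
               for j, s in enumerate(symbols) if s is not None]
    known = {}
    while len(known) < k:
        ready = [e for e in pending if len(e[0]) == 1]
        if not ready:
            return None                      # no degree-one symbol left
        i = next(iter(ready[0][0]))
        known[i] = ready[0][1]
        for e in pending:                    # substitute the recovered fragment
            if i in e[0]:
                e[0].discard(i)
                e[1] = xor_bytes(e[1], known[i])
    return [known[i] for i in range(k)]


G = [[1, 0, 0, 1],
     [0, 1, 0, 1],
     [0, 0, 1, 1]]
data = [b"ab", b"cd", b"ef"]
y = encode(data, G)
y[1] = None                                  # one erased component
print(bp_decode(y, G, 3) == data)            # True
```

Decoding only ever copies a symbol or XORs known fragments into it, which is the source of the efficiency of BP decoding compared with Gaussian elimination.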

It is easy to see that each flat BP-XOR code is a flat XOR 
code, but the other direction may not hold. We can consider 
flat BP-XOR codes as one kind of application of LT codes
to reliable storage system design with deterministic decoding. 



Example 3.1 shows that the flat BP-XOR version of the LT code
may not be applicable to threshold based distributed storage
systems as proposed in [7]. In the following, we first mention
a folklore fact to support our arguments.

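Fact 4.1: Let $n \ge k + 2$, $k \ge 2$, and $d = n - k + 1$. Then
there is no [n, k, d] BP-XOR code.

The fact could be easily proved by the following observation:
Let $H = [\beta_1^T, \ldots, \beta_k^T \mid I_{n-k}]$ be an $(n - k) \times n$ parity
check matrix. If every $n - k$ columns in the matrix $[\beta_i^T \mid I_{n-k}]$
are linearly independent, then $\mathrm{wt}(\beta_i) = n - k$, where $\mathrm{wt}(\cdot)$
is the Hamming weight. Hence every $\beta_i$ is the all-ones vector,
and for $k \ge 2$ the columns $\beta_1^T$ and $\beta_2^T$ of H coincide and are
therefore linearly dependent. Thus for $n \ge k + 2$, there is neither a
binary linear [n, k, d] code nor an [n, k, d] BP-XOR code.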

Fact 4.1 shows the impossibility of designing flat [n, k, d]
BP-XOR codes for $n \ge k + 2$ and $d = n - k + 1$. Since
flat BP-XOR codes are extremely efficient for encoding and
decoding in practice, we are also interested in flat BP-XOR
codes that are not MDS (maximum distance separable). In the
following, we show theoretical bounds for designing flat BP-XOR
codes for distributed data storage systems.

For an MDS [n, k, d] code with $d = n - k + 1$, we can
tolerate $d - 1$ erasure faults. The question that we are interested
in is: for given $n \ge k + 2$, what is the best distance d we could
achieve for a flat [n, k, d] BP-XOR code? Fact 4.1 shows that
d must be less than $n - k + 1$.

Tolerating one erasure fault: Let $a \in \{1\}^k$. The generator
matrix $[I_k \mid a^T]$ corresponds to the MDS flat [k + 1, k, 2]
BP-XOR code that could tolerate one erasure fault.

Tolerating two erasure faults: Fact 4.1 shows that two
parity check servers are not sufficient to tolerate two erasure
faults for flat BP-XOR codes. In order to tolerate two erasures,
we have to consider codes with $n \ge k + 3$. For $n = k + 3$, the
following generator matrices show the existence of flat [5, 2, 3],
[6, 3, 3], and [7, 4, 3] BP-XOR codes for tolerating two erasure
faults.



$$
\begin{pmatrix} 1&0&1&1&0 \\ 0&1&1&0&1 \end{pmatrix}, \quad
\begin{pmatrix} 1&0&0&1&1&0 \\ 0&1&0&1&0&1 \\ 0&0&1&1&1&1 \end{pmatrix}, \quad
\begin{pmatrix} 1&0&0&0&1&1&0 \\ 0&1&0&0&1&0&1 \\ 0&0&1&0&0&1&1 \\ 0&0&0&1&1&1&1 \end{pmatrix}
$$

Indeed, the above three codes are the only flat [k + 3, k, 3] BP-
XOR codes tolerating two erasure faults with three redundancy
columns.

Theorem 4.2: For $n \ge k + 3$ and $k \ge 3$, there exists a flat
[n, k, 3] BP-XOR code if and only if $k \le 2^{n-k} - (n - k) - 1$.
Proof. Let $H = [\beta_1^T, \ldots, \beta_k^T \mid I_{n-k}]$ be an $(n - k) \times n$
parity check matrix. The code determined by H has minimum
distance 3 if and only if every 2 columns in H are linearly
independent. This implies that H is the parity check matrix of
a flat [n, k, 3] BP-XOR code if and only if for every $\beta_i$, we
have $\mathrm{wt}(\beta_i) \ge 2$, where $\mathrm{wt}(\cdot)$ is the Hamming weight. By the
fact that

$|\{\beta \in \{0, 1\}^{n-k} : \mathrm{wt}(\beta) \ge 2\}| = 2^{n-k} - (n - k) - 1,$

it follows that there exists a flat [n, k, 3] BP-XOR code if and
only if $k \le 2^{n-k} - (n - k) - 1$. □
Note: It should be noted that the code we have constructed
in Theorem 4.2 is the well known Hamming code when
$k = 2^{n-k} - (n - k) - 1$. For $k < 2^{n-k} - (n - k) - 1$, it is a
truncated version of the Hamming code.

By Theorem 4.2, there is no flat [k + 4, k, 3] BP-XOR code
for $k \ge 12$. Table I lists the required redundancy for tolerating
two erasure faults when the value of k changes. All the codes
that we have constructed in Theorem 4.2 are systematic. Based
on the proof of Theorem 4.2, we have the following corollary.

TABLE I

Redundancy for flat BP-XOR [n, k, 3] codes

k                      required redundancy    BP-XOR code
$2 \le k \le 4$        3                      [k + 3, k, 3]
$5 \le k \le 11$       4                      [k + 4, k, 3]
$12 \le k \le 26$      5                      [k + 5, k, 3]
$27 \le k \le 57$      6                      [k + 6, k, 3]


Corollary 4.3: For $n > k$ and $k \ge 3$, there exists an [n, k, 3]
binary linear code if and only if there exists a systematic
[n, k, 3] binary linear code and if and only if there exists a
systematic flat [n, k, 3] BP-XOR code.
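
The bound $k \le 2^{n-k} - (n - k) - 1$ of Theorem 4.2 and the construction in its proof can be checked numerically. The sketch below is illustrative only: it builds G = [I_k | A^T] from k distinct vectors of weight at least 2 (taken in increasing integer order, an arbitrary choice), and verifies with a peeling decoder that every pattern of two erasures is recoverable. All function names are hypothetical.

```python
from itertools import combinations


def max_k(r):
    """Largest k allowed by Theorem 4.2 for r = n - k redundancy columns."""
    return 2 ** r - r - 1


def build_code(k, r):
    """Columns of G = [I_k | A^T], each a bitmask over the k fragments."""
    betas = [b for b in range(1, 2 ** r) if bin(b).count("1") >= 2][:k]
    if len(betas) < k:
        raise ValueError("k exceeds 2^r - r - 1")
    cols = [1 << i for i in range(k)]                  # systematic part I_k
    for j in range(r):                                 # parity column j
        cols.append(sum(1 << i for i in range(k) if (betas[i] >> j) & 1))
    return cols


def bp_decodes(masks, k):
    """Peeling decoder: True if all k fragments are recovered."""
    known = 0
    progress = True
    while progress:
        progress = False
        for m in masks:
            rest = m & ~known
            if rest and (rest & (rest - 1)) == 0:
                known |= rest
                progress = True
    return known == (1 << k) - 1


def tolerates_two_erasures(k, r):
    cols = build_code(k, r)
    return all(
        bp_decodes([c for j, c in enumerate(cols) if j not in lost], k)
        for lost in combinations(range(k + r), 2))


print([max_k(r) for r in range(3, 7)])                         # [4, 11, 26, 57]
print(all(tolerates_two_erasures(k, 3) for k in range(2, 5)))  # True
```

The first line of output reproduces the upper ends of the k ranges for redundancy 3 to 6.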

Tolerating three erasure faults: We first prove the follow- 
ing theorem for the convenience of proving the existence of 
systematic flat XOR codes. 

Theorem 4.4: For $n > k$ and $d \le n - k + 1$, there exists
an [n, k, d] binary linear code if and only if there exists
an $(n - k) \times k$ matrix $A = (\beta_1^T, \ldots, \beta_k^T)$ with the following
properties:

1) $\beta_i \in \{0, 1\}^{n-k}$ for $1 \le i \le k$.

2) Let $d_1 + d_2 = d - 1$. If we remove $d_2$ rows from A, then
every $d_1$ columns of the remaining matrix are linearly
independent.

Proof. First, it is straightforward to show that condition
2 in the Theorem implies that $\mathrm{wt}(\beta_i) \ge d - 1$ for $1 \le i \le k$,
where $\mathrm{wt}(\cdot)$ is the Hamming weight. It is also straightforward
to show that condition 2 in the Theorem implies that
every $d - 1$ columns in the matrix $[A \mid I_{n-k}]$ are linearly
independent. Thus, the linear code corresponding to the parity
check matrix $[A \mid I_{n-k}]$ is a binary linear [n, k, d] code. Note
that the generator matrix corresponding to the parity check
matrix $[A \mid I_{n-k}]$ is $G = [I_k \mid A^T]$.

For the other direction, assume that there exists a $k \times n$
generator matrix G for an [n, k, d] binary linear code. Let
$\alpha_1, \ldots, \alpha_n$ be the n columns of G. Without loss of generality,
we may assume that $\alpha_1, \ldots, \alpha_k$ are linearly independent. We
may also assume that

$G = (\alpha_1, \ldots, \alpha_n) = [I_k \mid A^T]$

where $A = (\beta_1^T, \ldots, \beta_k^T)$ and $\beta_i \in \{0, 1\}^{n-k}$.
Since the code has minimum distance d, the remaining
generator matrix should have rank k after removing any
$d - 1$ columns from G. Let $d_1 + d_2 = d - 1$ and assume that we
remove $d_1$ columns $\alpha_{i_1}, \ldots, \alpha_{i_{d_1}}$ for $i_u \le k$ and $d_2$ columns
$\alpha_{k+j_1}, \ldots, \alpha_{k+j_{d_2}}$ for $j_u \le n - k$ from the generator matrix
G. Then $\alpha_{i_1}, \ldots, \alpha_{i_{d_1}}$ should be able to be linearly generated
from the columns $\alpha_i$ for $i \ge k + 1$ and $i \ne k + j_1, \ldots, k + j_{d_2}$. This
is equivalent to the requirement that the rows $i_1, \ldots, i_{d_1}$ of
$I_k$ could be linearly generated from the remaining rows of A
after removing the rows $j_1, \ldots, j_{d_2}$ from A. It follows that the
remaining columns $i_1, \ldots, i_{d_1}$ of A are linearly independent
after removing the rows $j_1, \ldots, j_{d_2}$ from A. This completes
the proof of the Theorem. □
By Theorem 4.4, we have the following results.
Theorem 4.5: For $n \ge k + 4$, there exists a systematic flat
XOR [n, k, 4] code if and only if

$$
k \le \begin{cases}
2^{n-k-1} - n + k & \text{if } n - k \text{ is even} \\
2^{n-k-1} - n + k - 1 & \text{if } n - k \text{ is odd}
\end{cases}
$$

Proof. Let

$X = \{\beta : \beta \in \{0, 1\}^{n-k}, \mathrm{wt}(\beta) = 3, 5, 7, \ldots\}.$

Then

$$
|X| = \begin{cases}
2^{n-k-1} - n + k & \text{if } n - k \text{ is even} \\
2^{n-k-1} - n + k - 1 & \text{if } n - k \text{ is odd}
\end{cases}
$$

Define an $(n - k) \times k$ matrix $A = (\beta_1^T, \ldots, \beta_k^T)$ where $\beta_i \in X$.
It is straightforward to show that this matrix A satisfies
condition 2 of Theorem 4.4 for d = 4 (alternatively, every
three columns in the parity check matrix $[A \mid I_{n-k}]$ are linearly
independent). Thus the binary linear code corresponding to the
parity check matrix $[A \mid I_{n-k}]$ (or the generator matrix $[I_k \mid A^T]$)
is a flat XOR [n, k, 4] code.

For the other direction, it suffices to show that X is a
maximal set that satisfies condition 2 of Theorem 4.4 for
d = 4. This is proved by observing the fact that every even
Hamming weight vector $\beta \in \{0, 1\}^{n-k}$ is equal to $\beta_1 + \beta_2$ for
some $\beta_1, \beta_2 \in X$. This completes the proof of the theorem. □
In Theorem 4.5 we established a necessary and sufficient
condition for designing systematic flat XOR codes tolerating
three erasure faults. However, the codes we constructed in
Theorem 4.5 are not necessarily flat BP-XOR codes. For
example, let n = 7, k = 3, d = 4, and $\beta_1 = (1, 1, 1, 0)$,
$\beta_2 = (0, 1, 1, 1)$, and $\beta_3 = (1, 0, 1, 1)$. Then the corresponding
code has the following generator matrix:

$$
\begin{pmatrix}
1 & 0 & 0 & 1 & 1 & 1 & 0 \\
0 & 1 & 0 & 0 & 1 & 1 & 1 \\
0 & 0 & 1 & 1 & 0 & 1 & 1
\end{pmatrix}
$$

It is straightforward that this is not a flat BP-XOR code 
since if we remove the first three columns from the above 
generator matrix, no column in the remaining generator matrix 
has Hamming weight 1. Indeed, it is easy to show that for 
n = 7, k = 3, and d = 4, there is no flat [7, 3, 4] BP-XOR 
code. The reason is that in order for a [7, 3, 4] linear code to 
be a flat BP-XOR code, we have to have four columns with 
Hamming weight 1 in the generator matrix. Furthermore, we 
need to have Hamming weight 4 for each row. Without loss of 
generality, we may assume that the column $(1, 0, 0)^T$ occurs
twice in the generator matrix. Then we have to have three
columns in the generator matrix with the format $(b, 1, 1)^T$
where b = 0, 1. In other words, two columns of the generator
matrix are identical, which will reduce the code distance to 3.
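
The [7, 3, 4] example can be verified mechanically. The sketch below is an illustration under stated assumptions: columns of G = [I_3 | A^T] are encoded as bitmasks (bit i set when row i + 1 has a 1), the decoder is a plain peeling decoder, and the final count enumerates every multiset of seven nonzero columns. The names `min_distance` and `count_bp_xor_codes` are hypothetical.

```python
from itertools import combinations, combinations_with_replacement

K = 3  # number of data fragments


def bp_decodes(masks):
    """Peeling decoder: True if all K fragments are recovered."""
    known = 0
    progress = True
    while progress:
        progress = False
        for m in masks:
            rest = m & ~known
            if rest and (rest & (rest - 1)) == 0:
                known |= rest
                progress = True
    return known == (1 << K) - 1


def min_distance(cols):
    """Minimum Hamming weight over all nonzero codewords xG."""
    return min(sum(bin(x & c).count("1") % 2 for c in cols)
               for x in range(1, 1 << K))


# G = [I_3 | A^T] with beta_1 = (1,1,1,0), beta_2 = (0,1,1,1), beta_3 = (1,0,1,1).
G_COLS = [0b001, 0b010, 0b100, 0b101, 0b011, 0b111, 0b110]


def count_bp_xor_codes(n, survivors):
    """Number of column multisets that BP-decode from every `survivors` columns."""
    return sum(
        all(bp_decodes(sub) for sub in combinations(code, survivors))
        for code in combinations_with_replacement(range(1, 1 << K), n))


print(min_distance(G_COLS))        # 4: the code is a [7, 3, 4] flat XOR code
print(bp_decodes(G_COLS[3:]))      # False: BP is stuck without the first 3 columns
print(count_bp_xor_codes(7, 4))    # 0: no flat [7, 3, 4] BP-XOR code exists
```

Gaussian elimination would still recover the data from the last four columns, since they have rank 3; only the BP decoder fails.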

The above discussion shows that the condition in Theorem
4.5 is not sufficient for the existence of flat BP-XOR codes
tolerating three erasure faults. Though it is interesting to
identify necessary and sufficient conditions for the existence
of BP-XOR codes tolerating three or more erasure faults, it
is sufficient for us to use flat XOR codes in distributed
storage systems, since a simple XOR based Gaussian elimination
method could be used to recover the original data content in
the presence of erasure faults. This observation tells us that LT codes
(i.e., flat BP-XOR codes) may not be the best choice for
distributed storage systems in some cases.

As an example, Table II lists the required redundancy for
tolerating three erasure faults when the value of k changes.

TABLE II

Redundancy for flat XOR [n, k, 4] codes

k                      required redundancy    flat XOR code
$2 \le k \le 4$        4                      [k + 4, k, 4]
$5 \le k \le 10$       5                      [k + 5, k, 4]
$11 \le k \le 26$      6                      [k + 6, k, 4]
$27 \le k \le 56$      7                      [k + 7, k, 4]


Tolerating four or more erasure faults: In general, we are
also interested in designing flat BP-XOR codes for tolerating
more than three erasure faults. For distributed storage systems
we could generally use nested techniques (e.g., techniques
similar to nested RAID arrays). In the following, we present
several sufficient conditions for tolerating four or more erasure
faults. Normally these conditions are not necessary. It should
be noted that for general binary linear codes, there are well
known bounds (see, e.g., Verhoeff [25]). However, the codes
corresponding to these bounds are not necessarily flat BP-XOR
codes.

Theorem 4.6: For n > k + 5, there exists a systematic flat 
XOR [n, k, 5] code if k is less than 

n — k 



n-k-2 


+ 2 


K 


2 







2 /2 



-2) /2 

n — k 



Proof. Let $U = \{a_1, \ldots, a_{n-k}\}$ be an $(n - k)$-element set. In
the following, we construct four-element subsets of U so that
the characteristic sequences of these subsets could be used as
the columns of the parity check matrix. It will be convenient
for the reader to understand the following subset definitions
if the elements of U are interpreted as leaf nodes on a binary
tree of depth $\lfloor \log_2 (n - k) \rfloor$.

Vi = {oi,02,o 2 i+i,a2i+2} for 1 < i < [ n ^ 2 \ , 



V 



2.1 



3.0 



3,1 



= { ai ,a 3 ,a 4z+1 ,a 4l+3 } for 1 < i < [(\^]-2)/2\ 

= {a 2 ,a 4 ,a ii+2l a 4i+3 } for 1 < i < [(T^l - 2) /2j 

= {ai,a 5 ,a 8i+1 ,a 8i+5 } for 1 < i < [(\^]-2)/2\ 

= {a 4 ,a s ,a 4l+2 ,a 8l+5 } for 1 < i < [(\^]-2)/2\ 



Let $\beta_1, \ldots, \beta_w$ be the characteristic sequences of the above
sets. Then it is straightforward that the parity check matrix
$H = [\beta_1^T, \ldots, \beta_w^T \mid I_{n-k}]$ corresponds to a systematic flat XOR
code of minimum distance 5. The code has distance 5 since
every 4 columns in H are linearly independent by the facts
that (1) for any $\beta_1, \beta_2$, we have $\mathrm{wt}(\beta_1 + \beta_2) > 2$; and (2)
any three or four $\beta$ are linearly independent. The two facts
follow from the construction. This completes the proof of the
Theorem. □
As an example, Table III lists the required redundancy of flat
XOR codes for tolerating 4 erasure faults based on Theorem
4.6. Though the conditions in Theorem 4.6 are not necessary
in general, the bounds in Table III match the bounds for
general binary linear codes (see [25]). Thus the conditions in
Theorem 4.6 are also necessary for $k \le 9$.

TABLE III

Redundancy for flat XOR [n, k, 5] codes

k                      required redundancy    flat XOR code
$k \le 2$              6                      [k + 6, k, 5]
$3 \le k \le 4$        7                      [k + 7, k, 5]
$5 \le k \le 9$        8                      [k + 8, k, 5]




V. Array BP-XOR codes for distributed storage systems



Array codes have been studied extensively for burst error 
correction in communication systems and storage systems (see, 
e.g., El, H, 0, @, El, 129], ED). Array codes are linear 
codes where information and parity data are placed in a two
dimensional matrix array. Appropriately designed array codes
such as EVENODD [3], RDP [9], STAR [17], and X-code [30]
are very useful for high speed storage application systems
since they enjoy low-complexity decoding and low update 
complexity. Most of these array codes are designed for RAID 
array based storage systems with the specific requirements 
such as systematic code, efficient decoding algorithm, and 
minimum update complexity, where update complexity refers 
to the number of encoding data symbols that need to be 
updated if one information data symbol is changed. 

For distributed storage systems such as cloud storage, we
may not need the code to be systematic. As studied in [20],
[23], [7], LT codes or digital fountain techniques could be a
better choice for distributed storage systems. However, as we
have mentioned in previous sections and as supported by the
experimental results in [23], the probabilistic bounds for LT
codes perform well only asymptotically as the number n of
encoding symbols increases. For small values of n (n < 100), the
codes display their worst performance. Thus it is important to
study applicable coding schemes with better performance
for distributed storage systems. As we have noticed, one of
the major advantages that contribute to the efficiency of the LT
decoding process is the Belief Propagation (BP) process. In
the following, we design a kind of array code that could be
efficiently decoded using the BP process. We will call such
codes array BP-XOR codes. Appropriately designed
array BP-XOR codes could achieve the MDS property from
both communication and storage aspects: for k blocks of the
original data, surviving storage servers only need to store
k encoding blocks. Note that in LT codes, in
order to decode k blocks of data with probability $1 - \delta$,
$k + O(\sqrt{k} \ln^2(k/\delta))$ encoding blocks are needed.

An array BP-XOR code is defined as follows. Let
$\alpha_1, \ldots, \alpha_k \in \{0, 1\}^l$ be the data fragments that we want to
encode, where $l$ is any fixed number. A t-erasure tolerating
array BP-XOR code is an $m \times n$ matrix $C = [\sigma_{i,j}]_{1 \le i \le m, 1 \le j \le n}$
such that:

1) Each $\sigma_{i,j}$ is the XOR of one or more elements from the
data fragments $\alpha_1, \ldots, \alpha_k$.

2) $\alpha_1, \ldots, \alpha_k$ could be recovered from any $n - t$ columns
of the matrix using the binary erasure channel based BP
algorithm.

If we add the restriction that each element in
$C = [\sigma_{i,j}]_{1 \le i \le m, 1 \le j \le n}$ be the XOR of at most two elements from
the data fragments $\alpha_1, \ldots, \alpha_k$, then the restricted array BP-
XOR codes are equivalent to the edge-colored graph models
introduced by Wang and Desmedt in [27] for tolerating
network homogeneous faults.

A. Edge-colored graphs 

In this section, we first describe the edge-colored graph
model by Wang and Desmedt [27]. The reader should be
reminded that the edge-colored graph model in [27] is slightly
different from the edge-colored graph definition in most of the
literature. In most of the literature, the coloring of the edges is required
to meet the condition that no two adjacent edges have the same
color. This condition is not required in the definition of [27].

Definition 5.1: (Wang and Desmedt [27]) An edge-colored
graph is a tuple $G(V,E,C,f)$, with $V$ the node set, $E$ the
edge set, $C$ the color set, and $f$ a map from $E$ onto $C$. The
structure

$$\mathcal{Z}_{C,t} = \{Z : Z \subseteq E \text{ and } |f(Z)| \le t\}$$

is called a $t$-color adversary structure. Let $A, B \in V$ be
distinct nodes of $G$. $A, B$ are called $(t+1)$-color connected
for $t \ge 1$ if for any color set $C_t \subseteq C$ of size $t$, there is a path
$p$ from $A$ to $B$ in $G$ such that the edges on $p$ do not contain
any color in $C_t$. An edge-colored graph $G$ is $(t+1)$-color
connected if and only if for any two nodes $A$ and $B$ in $G$,
they are $(t+1)$-color connected.
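
Definition 5.1 can be checked by brute force on small graphs: delete every set of $t$ colors in turn and test whether the remaining graph is still connected. The following Python sketch is an illustration only; the function names are assumptions, and the edge list `G42` is read off the first-step code for $G_{4,2}$ given in Section V-B, assuming that the $j$th column of that code corresponds to the $j$th color.

```python
from itertools import combinations

def connected(nodes, edges):
    """Depth-first search connectivity test on an undirected graph."""
    nodes = set(nodes)
    adj = {v: set() for v in nodes}
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    start = next(iter(nodes))
    seen, stack = {start}, [start]
    while stack:
        for w in adj[stack.pop()]:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return seen == nodes

def is_color_connected(nodes, colored_edges, t):
    """True if the graph stays connected after deleting any t colors,
    that is, if it is (t+1)-color connected in the sense of Definition 5.1."""
    colors = sorted({c for _, _, c in colored_edges})
    for removed in combinations(colors, t):
        kept = [(a, b) for a, b, c in colored_edges if c not in removed]
        if not connected(nodes, kept):
            return False
    return True

# G_{4,2}: 7 nodes, 12 edges, 4 colors, written as (node, node, color).
G42 = [(1, 2, 1), (3, 6, 1), (4, 7, 1),
       (2, 3, 2), (1, 5, 2), (3, 7, 2),
       (4, 5, 3), (6, 7, 3), (1, 3, 3),
       (2, 5, 4), (1, 4, 4), (4, 6, 4)]
print(is_color_connected(range(1, 8), G42, t=2))
```

The check enumerates all color subsets of size $t$, so it is only practical for small color sets such as the examples in this section.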

As an example, Figure 1 contains two 3-color connected
edge-colored graphs $G_{4,1}$ and $G_{4,2}$. $G_{4,1}$ contains 5 nodes, 8
edges, and 4 colors. $G_{4,2}$ contains 7 nodes, 12 edges, and 4
colors.

Fig. 1. 3-color connected edge-colored graphs: (a) $G_{4,1}$, (b) $G_{4,2}$

A general 3-color connected edge-colored graph with $k$
nodes can be constructed as follows.

1) For $k = 4r+1$, the $v_1$ nodes of $r$ copies of $G_{4,1}$ are glued
together to form a 3-color connected edge-colored graph
$G$ with $4r+1$ nodes and $8r$ edges.

2) For $k = 4r+3$, the $v_1$ nodes of $r-1$ copies of $G_{4,1}$
and one copy of $G_{4,2}$ are glued together to form a
3-color connected edge-colored graph $G$ with $4r+3$ nodes and
$8r+4$ edges.

3) For $k = 4r+2$ (respectively $k = 4r+4$) with $r \ge 1$,
one node is added to the 3-color connected
edge-colored graph $G$ with $4r+1$ nodes (respectively
$4r+3$ nodes) by connecting this node to any 3
nodes within the graph using edges of distinct colors. The resulting
graph is a 3-color connected edge-colored graph with
$4r+2$ (respectively $4r+4$) nodes and $8r+3$ (respectively
$8r+7$) edges.

For convenience, an edge-colored graph could also be
represented by a table, where the edges with the same color are
put in the same column. For example, $G_{4,1}$ and $G_{4,2}$ in Figure
1 are represented in Table IV.

TABLE IV
Table representation of edge-colored graphs $G_{4,1}$ and $G_{4,2}$

$G_{4,1}$    $(v_1,v_2)$    $(v_1,v_5)$    $(v_1,v_3)$    $(v_1,v_4)$
             …              …              …              …

$G_{4,2}$    $(v_1,v_2)$    $(v_2,v_3)$    $(v_4,v_5)$    $(v_2,v_5)$
             $(v_3,v_6)$    $(v_1,v_5)$    $(v_6,v_7)$    $(v_1,v_4)$
             $(v_4,v_7)$    $(v_3,v_7)$    $(v_1,v_3)$    $(v_4,v_6)$

Wang and Desmedt [27] showed several constructions of
edge-colored graphs with certain color connectivity. In the
following, we present a general construction of $(t+1)$-color
connected edge-colored graphs using perfect one-factorizations of
complete graphs. We use $K_n = (V, E)$ to denote the complete
graph with $n$ nodes. For an even $n$, a one-factor of $K_n$ is a
set of pairwise disjoint edges that partition the set of nodes
in $V$. A one-factorization of $K_n$ ($n$ even) is a set of
one-factors that partition the set of edges $E$. A one-factorization
is called perfect if the union of every two distinct one-factors
is a Hamiltonian circuit. It is shown (see, e.g., Anderson [2]
and Kobayashi [18]) that perfect one-factorizations of $K_{p+1}$,
$K_{2p}$, and certain $K_{2n}$ do exist, where $p$ is a prime number.
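
The definition of a perfect one-factorization can be tested directly on small complete graphs. The Python sketch below is an illustration and not part of the cited constructions: it assumes the well-known cyclic one-factorization of $K_{p+1}$ on the vertices $0, \ldots, p-1$ plus one extra vertex (labelled `"inf"` here), in which factor $i$ pairs the extra vertex with $i$ and pairs $i+a$ with $i-a$ modulo $p$. All function names are hypothetical.

```python
from itertools import combinations

def gk_factors(p):
    """Cyclic one-factorization of K_{p+1} for odd p (assumed construction)."""
    factors = []
    for i in range(p):
        factor = [("inf", i)]
        factor += [((i + a) % p, (i - a) % p)
                   for a in range(1, (p - 1) // 2 + 1)]
        factors.append(factor)
    return factors

def is_one_factorization(factors, vertices):
    """Each factor must cover every vertex once; together the factors
    must contain every edge of the complete graph."""
    vertices = set(vertices)
    edges = set()
    for factor in factors:
        covered = [v for e in factor for v in e]
        if len(covered) != len(vertices) or set(covered) != vertices:
            return False
        edges |= {frozenset(e) for e in factor}
    n = len(vertices)
    return len(edges) == n * (n - 1) // 2

def union_is_hamiltonian(f1, f2):
    """The union of two edge-disjoint one-factors is 2-regular, so it is a
    Hamiltonian circuit exactly when it is connected."""
    adj = {}
    for a, b in f1 + f2:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    start = f1[0][0]
    seen, stack = {start}, [start]
    while stack:
        for w in adj[stack.pop()]:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return len(seen) == len(adj)

def is_perfect(factors):
    return all(union_is_hamiltonian(f1, f2)
               for f1, f2 in combinations(factors, 2))

print(is_perfect(gk_factors(7)), is_perfect(gk_factors(9)))
```

Under these assumptions the check succeeds for the small primes tried in the tests and fails for $p = 9$, which illustrates why the construction is stated for prime $p$.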

Theorem 5.2: Let $n$ be an even number such that there is
a perfect one-factorization $F_1, \ldots, F_{n-1}$ of $K_n$. For each
$t \le n-3$, there exists a $(t+1)$-color connected edge-colored
graph $G$ with $n-1$ nodes, $(t+2)(n/2-1)$ edges, and $t+2$
colors.

Proof. Let $V = \{v_1, \ldots, v_{n-1}\}$, $F'_i = F_i \setminus \{(v_n, v)\}$
(that is, $F_i$ with its single edge at $v_n$ removed), and
$E = F'_1 \cup \cdots \cup F'_{t+2}$, and color all edges in $F'_i$ with color $c_i$
for $i \le t+2$. Then it is straightforward to check that the
edge-colored graph $(V, E)$ is $(t+1)$-color connected, $|V| = n-1$,
and $|E| = (t+2)(n/2-1)$. □



Remarks on Proof of Theorem 5.2. Since only node
connectivity instead of a Hamiltonian circuit is required for $(t+1)$-color
connected graphs, we could use $F'_i$ instead of $F_i$ to construct
the edge-colored graphs. By using $F'_i$, we remove $t+2$ edges
and one node from the resulting edge-colored graph. This
helps us to keep the cost of connectivity at a minimum.
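
The counts and the connectivity claimed in Theorem 5.2 can be checked numerically for small parameters. The sketch below is only an illustration under stated assumptions: it uses the cyclic one-factorization of $K_{p+1}$ (so $n = p+1$, with the extra vertex playing the role of $v_n$), labels the nodes $0, \ldots, p-1$ instead of $v_1, \ldots, v_{n-1}$, and uses hypothetical function names.

```python
from itertools import combinations

def theorem_5_2_graph(p, t):
    """Edge-colored graph built from the first t+2 one-factors of K_{p+1},
    each with its edge at the extra vertex deleted (the F'_i of the proof).
    Returns (nodes, edges) with edges given as (u, v, color)."""
    assert t + 2 <= p          # t <= n - 3 with n = p + 1
    nodes = list(range(p))
    half = (p - 1) // 2
    edges = [((i + a) % p, (i - a) % p, i)
             for i in range(t + 2) for a in range(1, half + 1)]
    return nodes, edges

def survives(nodes, edges, removed):
    """Union-find connectivity test after deleting the colors in `removed`."""
    parent = {v: v for v in nodes}

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]   # path halving
            v = parent[v]
        return v

    for a, b, c in edges:
        if c not in removed:
            parent[find(a)] = find(b)
    return len({find(v) for v in nodes}) == 1

def is_color_connected(nodes, edges, t):
    colors = sorted({c for _, _, c in edges})
    return all(survives(nodes, edges, set(r))
               for r in combinations(colors, t))

nodes, edges = theorem_5_2_graph(p=7, t=2)   # n = 8
print(len(nodes), len(edges), is_color_connected(nodes, edges, 2))
```

For $p = 7$ and $t = 2$ this gives $n - 1 = 7$ nodes and $(t+2)(n/2-1) = 12$ edges, in agreement with the theorem.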

B. Constructing array BP-XOR codes from edge-colored 
graphs 

We now use edge-colored graphs to construct array
BP-XOR codes. As an example, we first give the BP-XOR code
corresponding to the graph $G_{4,2}$ in Table IV. In Step 1, the
$G_{4,2}$ part of Table IV is converted to the code in Table V.
In Step 2, choose any fixed node and remove all of its
occurrences from the code in Table V. For convenience, we
choose to remove the occurrences of $v_7$ and get the BP-XOR
code in Table VI.

TABLE V
First step code for $G_{4,2}$

$v_1 \oplus v_2$    $v_2 \oplus v_3$    $v_4 \oplus v_5$    $v_2 \oplus v_5$
$v_3 \oplus v_6$    $v_1 \oplus v_5$    $v_6 \oplus v_7$    $v_1 \oplus v_4$
$v_4 \oplus v_7$    $v_3 \oplus v_7$    $v_1 \oplus v_3$    $v_4 \oplus v_6$

It is easy to check that the data fragments $v_1, \ldots, v_6$ can be
recovered from any two columns of coding symbols. It is also
straightforward to observe that the code in Table VI achieves
optimal space and communication bandwidth in the event of
two column erasures.

TABLE VI
BP-XOR code corresponding to $G_{4,2}$

$v_1 \oplus v_2$    $v_2 \oplus v_3$    $v_4 \oplus v_5$    $v_2 \oplus v_5$
$v_3 \oplus v_6$    $v_1 \oplus v_5$    $v_6$               $v_1 \oplus v_4$
$v_4$               $v_3$               $v_1 \oplus v_3$    $v_4 \oplus v_6$



In the following, we give the general construction of
BP-XOR codes from edge-colored graphs. Let $v_1, v_2, \ldots, v_k \in
\{0,1\}^l$ be data blocks that we want to encode, where $l$ is any
fixed length. Let $G(V,E,C,f)$ be a $(t+1)$-color connected
edge-colored graph with $V = \{v_1, \ldots, v_k, v_{k+1}\}$, $|E| = \lambda$,
and $C = \{c_1, c_2, \ldots, c_n\}$. If we consider the nodes in the
edge-colored graph $G(V,E,C,f)$ as data blocks, edges as
parity check blocks of their adjacent nodes, and colors on
the edges as labels for placing the parity checks into different
columns of the array code, then the following steps construct an
$m \times n$ array BP-XOR code, where
$m = \max_{c \in C} \{|Z| : Z \subseteq E, f(Z) = c\}$.

1) For $1 \le i \le n$, let

$$\beta'_i = \{v_{i'} \oplus v_{j'} : (v_{i'}, v_{j'}) \in E,\ f((v_{i'}, v_{j'})) = c_i\}.$$

2) For each $\beta'_i$, replace the entry $v_{k+1} \oplus v_j$ with $v_j$ if such an
entry exists. Furthermore, if $|\beta'_i|$ is smaller than $m$, add
empty elements to $\beta'_i$ to make it an $m$-length vector $\beta_i$.

3) The array BP-XOR code is then specified by the $m \times n$
matrix $C_G = [\beta_1, \ldots, \beta_n]$, whose $i$th column is $\beta_i$.
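
The three steps above can be sketched in a few lines of Python. This is an illustration rather than the author's implementation: cells are represented as sets of node indices, an empty set stands for a padding cell, and the function name `graph_to_code` and the edge list `G42` (the graph $G_{4,2}$ with $k = 6$ and $v_7$ as the removed node) are assumptions.

```python
def graph_to_code(k, colored_edges, n):
    """Steps 1)-3): column c holds one parity symbol per edge of color c,
    and the distinguished node v_{k+1} is dropped from every symbol.

    colored_edges: iterable of (i, j, c) with nodes 1..k+1 and colors 1..n.
    Returns n columns, each a list of m frozensets (empty set = padding).
    """
    cols = [[] for _ in range(n)]
    for i, j, c in colored_edges:
        # Step 1: v_i XOR v_j; Step 2: replace v_{k+1} XOR v_j by v_j.
        cols[c - 1].append(frozenset({i, j}) - {k + 1})
    m = max(len(col) for col in cols)
    # Step 2 (padding) and Step 3 (assemble the columns).
    return [col + [frozenset()] * (m - len(col)) for col in cols]

G42 = [(1, 2, 1), (3, 6, 1), (4, 7, 1),
       (2, 3, 2), (1, 5, 2), (3, 7, 2),
       (4, 5, 3), (6, 7, 3), (1, 3, 3),
       (2, 5, 4), (1, 4, 4), (4, 6, 4)]
code = graph_to_code(6, G42, 4)
for row in zip(*code):
    print(["+".join(f"v{i}" for i in sorted(cell)) for cell in row])
```

Printed row by row, the result has the same cells as the BP-XOR code for $G_{4,2}$ in Table VI.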

Next we show that the above defined array BP-XOR code
$C_G$ can tolerate $t$ column erasure faults. Let $C_t \subset C$ be any
set of $t$ colors of the graph $G$ and assume that the $t$ columns
corresponding to the color set $C_t$ are missing in $C_G$. Since
the graph $G$ is $(t+1)$-color connected, for any node $v_i \in V$,
we have a path $p = (v_{k+1}, v_{i_1}, v_{i_2}, \ldots, v_{i_j}, v_i)$ without using
any colors in $C_t$. Thus $v_i$ could be recovered by the following
equation

$$v_i = v_{i_1} \oplus (v_{i_1} \oplus v_{i_2}) \oplus \cdots \oplus (v_{i_j} \oplus v_i)$$

where $v_{i_1}, v_{i_1} \oplus v_{i_2}, \ldots, v_{i_j} \oplus v_i$ are all contained in the
non-missing columns. Thus the Belief Propagation process
could be used to recover all of the data blocks $v_1, \ldots, v_k$
from the non-missing columns with only $k$ XOR operations
on the encoding symbols.
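
The recovery argument can be exercised on actual data. The following Python sketch is an illustration under stated assumptions: it hard-codes the $3 \times 4$ code for $G_{4,2}$ from Table VI as sets of fragment indices, uses random 8-byte blocks as data, and the helper names `xor`, `encode`, and `bp_recover` are hypothetical.

```python
import os
from itertools import combinations

def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

def encode(code, data, size):
    """Compute every cell as the XOR of the data blocks it names."""
    stored = []
    for col in code:
        cells = []
        for sym in col:
            block = bytes(size)
            for i in sym:
                block = xor(block, data[i])
            cells.append(block)
        stored.append(cells)
    return stored

def bp_recover(code, stored, alive):
    """BP decoding that only reads the columns listed in `alive`."""
    pending = [(set(code[j][r]), stored[j][r])
               for j in alive for r in range(len(code[j])) if code[j][r]]
    known = {}
    progress = True
    while progress:
        progress = False
        remaining = []
        for sym, block in pending:
            for i in [i for i in sym if i in known]:
                block = xor(block, known[i])    # peel off known blocks
                sym.discard(i)
            if len(sym) == 1:                   # degree one: block recovered
                known[sym.pop()] = block
                progress = True
            elif sym:
                remaining.append((sym, block))
        pending = remaining
    return known

TABLE_VI = [
    [{1, 2}, {3, 6}, {4}],
    [{2, 3}, {1, 5}, {3}],
    [{4, 5}, {6}, {1, 3}],
    [{2, 5}, {1, 4}, {4, 6}],
]
data = {i: os.urandom(8) for i in range(1, 7)}
stored = encode(TABLE_VI, data, 8)
print(all(bp_recover(TABLE_VI, stored, pair) == data
          for pair in combinations(range(4), 2)))
```

Each pair of surviving columns holds six cells for six data blocks, so no stored block is wasted during recovery.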

C. Constructing edge-colored graphs from array BP-XOR 
codes 

In this section, we show that for each array BP-XOR code, 
we could construct a corresponding edge-colored graph. 

Theorem 5.3: Let $C$ be an $m \times n$ array BP-XOR code with
the following properties:

1) $C$ is $t$-erasure tolerating;

2) $C$ contains $k$ information symbols; and

3) $C$ contains only degree one and two encoding symbols.

Then there exists a $(t+1)$-color connected edge-colored graph
$G(V, E, C, f)$ with $|V| = k+1$, $|E| = mn$, and $|C| = n$.

Proof. Let $v_1, \ldots, v_k$ be the information symbols of $C =
[\sigma_{i,j}]_{(i,j) \in [1,m] \times [1,n]}$ and $v_{i_1}, \ldots, v_{i_u}$ be a list of the degree one
encoding symbols in $C$. Then the $(t+1)$-color connected
edge-colored graph $G(V, E, C, f)$ is defined by the following steps:

1) $V = \{v_1, \ldots, v_k, v_{k+1}\}$;

2) $E = \bigcup_{j \in [1,u]} \{(v_{k+1}, v_{i_j})\} \cup \{(v_{i'}, v_{j'}) : \sigma_{i,j} = v_{i'} \oplus
v_{j'} \in C\}$;

3) $C = \{c_1, \ldots, c_n\}$;

4) for each $\sigma_{i,j} = v_{i'} \oplus v_{j'} \in C$, let $f((v_{i'}, v_{j'})) = c_j$, and
for each $\sigma_{i,j} = v_{j'} \in C$, let $f((v_{k+1}, v_{j'})) = c_j$.

Let $C_t$ be a color set of size $t$ and let $v_i$ and $v_j$ be two nodes.
Since the code $C$ is $t$-erasure tolerating, both $v_i$ and $v_j$
could be recovered from encoding symbols not contained in
the columns corresponding to the colors in $C_t$. Thus there
exists a path $p$ ($q$ respectively) connecting $v_{k+1}$ to $v_i$ (to $v_j$
respectively) without using $C_t$-colored edges. It follows that
$G(V, E, C, f)$ is $(t+1)$-color connected. □
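
The graph construction used in this proof can be sketched as follows. The code is an illustration only: cells are sets of one or two information-symbol indices, the auxiliary node $v_{k+1}$ is represented by the integer $k+1$, empty cells (which the theorem does not consider) are skipped, and the function name `code_to_graph` is an assumption.

```python
def code_to_graph(code, k):
    """Build the edge-colored graph of Theorem 5.3 from an array code.

    code: list of n columns; each cell is a set of 1 or 2 indices in 1..k.
    Returns (V, E) where E is a list of (u, v, color) and color j is column j.
    """
    V = list(range(1, k + 2))
    E = []
    for j, col in enumerate(code, start=1):
        for cell in col:
            idx = sorted(cell)
            if len(idx) == 2:        # degree two: edge between the two symbols
                E.append((idx[0], idx[1], j))
            elif len(idx) == 1:      # degree one: edge to the auxiliary node
                E.append((idx[0], k + 1, j))
            elif idx:
                raise ValueError("only degree one and two symbols are allowed")
    return V, E

TABLE_VI = [
    [{1, 2}, {3, 6}, {4}],
    [{2, 3}, {1, 5}, {3}],
    [{4, 5}, {6}, {1, 3}],
    [{2, 5}, {1, 4}, {4, 6}],
]
V, E = code_to_graph(TABLE_VI, k=6)
print(len(V), len(E), len({c for _, _, c in E}))
```

Applied to the Table VI code, this returns a graph with $k + 1 = 7$ nodes, $mn = 12$ edges, and $n = 4$ colors, which is $G_{4,2}$ with $v_7$ in the role of $v_{k+1}$.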

VI. Examples of bandwidth optimal array BP-XOR codes


In this section, we use the edge-colored graphs in Theorem 5.2
to construct $m \times n$ BP-XOR codes that could tolerate $n-2$
erasure columns. The general process is as follows: for a given
number $n$ of code columns and a number $t$ of erasure columns,
we first design $(t+1)$-color connected edge-colored graphs
with $n$ colors and the smallest number of graph edges. The
resulting edge-colored graph is then converted to the BP-XOR
code with the process described in the previous section.

In order to design an $m \times n$ BP-XOR code tolerating $n-2$
erasure columns, find the smallest $p$ (or $2p$) such that $n \le p$
(or $n \le 2p-1$), where $p$ is an odd prime. Suppose $p$ is
such a prime with $n \le p$. Then Table VII defines a
$(p-1)$-color connected edge-colored graph with $p$ nodes and $p$ colors
(based on the perfect one-factorization of $K_{p+1}$ in [18]).

TABLE VII
$(p-1)$-color connected edge-colored graphs

$(v_1, v_{p-1})$                $\cdots$    $(v_p, v_{p-2})$
$(v_2, v_{p-2})$                $\cdots$    $(v_1, v_{p-3})$
$\vdots$                                    $\vdots$
$(v_{(p-1)/2}, v_{(p+1)/2})$    $\cdots$    $(v_{(p-3)/2}, v_{(p-1)/2})$

In Table VII, if we consider the first column as a sequence of
numbers $1, p-1;\ 2, p-2;\ \cdots;\ (p-1)/2, (p+1)/2$, then the
$i$th column of the table is defined by the following sequence
of numbers (operations are mod $p$ and 0 is replaced with $p$):

$$1+i,\ p-1+i;\quad 2+i,\ p-2+i;\quad \cdots;\quad (p-1)/2+i,\ (p+1)/2+i.$$
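
The column rule can be turned into a short generator. The sketch below is an illustration with assumed conventions: columns are indexed by the shift $i = 0, \ldots, p-1$, so that $i = 0$ gives the first column of Table VII and $i = p-1$ the last one, and the function name `table_vii_column` is hypothetical.

```python
def table_vii_column(p, i):
    """Edges of the column with shift i: the pairs (a + i, p - a + i) for
    a = 1..(p-1)/2, reduced mod p with residue 0 written as p."""
    fix = lambda x: x % p or p
    return [(fix(a + i), fix(p - a + i)) for a in range(1, (p - 1) // 2 + 1)]

p = 7
for i in range(p):
    print(i, table_vii_column(p, i))
```

For $p = 7$ the first column is $(v_1,v_6), (v_2,v_5), (v_3,v_4)$ and the last one is $(v_7,v_5), (v_1,v_4), (v_2,v_3)$, in agreement with the entries $(v_p, v_{p-2})$, $(v_1, v_{p-3})$, and $(v_{(p-3)/2}, v_{(p-1)/2})$ of Table VII.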

The above edge-colored graph is then converted to the
$(p-1)/2 \times p$ BP-XOR code in Table VIII, where $m = (p-1)/2$.
Then an $m \times n$ BP-XOR code is obtained by taking any
$n$ columns of Table VIII. Since the edge-colored graph
in Table VII is $(p-1)$-color connected, it follows that the
above constructed $m \times n$ BP-XOR code could tolerate $n-2$
erasure columns. In other words, if the original data content
$F$ is divided into $p-1$ fragments $v_1, \ldots, v_{p-1} \in \{0,1\}^l$ of
equal length ($l$ bits) that are stored on $n$ servers according to
the BP-XOR code, then the original data content $F$ can
always be recovered from any two surviving servers. It should
also be noted that each storage server stores $(p-1)l/2$ bits
of data and the total data stored at two storage servers is
$(p-1)l = |F|$ bits. Thus the BP-XOR code is optimal in
bandwidth and space.

TABLE VIII
$(p-1)/2 \times p$ BP-XOR code

$v_1 \oplus v_{p-1}$    $\cdots$    $v_{p-1} \oplus v_{p-3}$    $v_{p-2}$
$v_2 \oplus v_{p-2}$    $\cdots$    $v_{p-4}$                   $v_1 \oplus v_{p-3}$
$\vdots$                            $\vdots$                    $\vdots$
$v_m \oplus v_{m+1}$    $\cdots$    $v_{m-2} \oplus v_{m-1}$    $v_{m-1} \oplus v_m$
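
The any-two-servers property can be checked by brute force for small primes. The Python sketch below is an illustration under stated assumptions: it builds the columns from the sequence rule for Table VII with $v_p$ removed, decodes symbolically (cells are sets of fragment indices), and the function names are hypothetical.

```python
from itertools import combinations

def table_viii_code(p):
    """(p-1)/2 x p code: the column with shift i holds v_{a+i} XOR v_{p-a+i}
    (indices mod p, 0 written as p), with every occurrence of v_p removed."""
    fix = lambda x: x % p or p
    return [[frozenset({fix(a + i), fix(p - a + i)}) - {p}
             for a in range(1, (p - 1) // 2 + 1)]
            for i in range(p)]

def bp_decode(columns):
    """Symbolic BP decoding; returns the set of recovered fragment indices."""
    symbols = [set(cell) for col in columns for cell in col]
    known, progress = set(), True
    while progress:
        progress = False
        for s in symbols:
            s -= known
            if len(s) == 1:
                known |= s
                s.clear()
                progress = True
    return known

def any_two_columns_suffice(code, k):
    full = set(range(1, k + 1))
    return all(bp_decode([code[a], code[b]]) == full
               for a, b in combinations(range(len(code)), 2))

for p in (5, 7, 11):
    print(p, any_two_columns_suffice(table_viii_code(p), p - 1))
```

Two columns together contain $p - 1$ cells for $p - 1$ fragments, which is the space and bandwidth optimality noted above.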

We should also note that the $(p-1)/2 \times p$ BP-XOR code
in Table VIII is equivalent to the code designed by Zaitsev,
Zinov'ev, and Semakov [14], which was reformulated later as
the dual code of the B-code in [29] using perfect one-factorizations
of complete graphs.

VII. The limitation of degree two encoding symbols

In this section we analyze the limitation of array BP-XOR 
codes when only degree one and two encoding symbols are 
allowed. Using the results for array BP-XOR codes, we will 
get new results for edge-colored graph models. 

For a $t$-erasure tolerating array BP-XOR code of size $m \times n$,
we could achieve the space and bandwidth optimal property if
there are $k = (n-t)m$ information symbols of the same length.
The following theorem provides a necessary condition for the
existence of array BP-XOR codes when only degree one and
two encoding symbols are used.



Theorem 7.1: Let $C = [\sigma_{i,j}]_{(i,j) \in [1,m] \times [1,n]}$ be a $t$-erasure
tolerating array BP-XOR code with $k = (n-t)m$ information
symbols such that $C$ only uses degree one and two encoding symbols.
Assume that $n_0 = n - t > 2$. Then we have

$$n \le \frac{n_0-1}{n_0-2}\left(n_0 - \frac{2}{(n_0-2)m+1}\right).$$


Proof. By the fact that $C$ is $t$-erasure tolerating, each
information symbol must occur in at least $t+1$ columns.
Since there are $n_0 m$ information symbols (data fragments) to
encode, the total number of information symbol occurrences
in $C$ is at least $n_0 m(t+1)$.

In order for the BP decoding process to work, we must start
from a degree one encoding symbol. Thus we need to have at
least $t+1$ degree one encoding symbols in distinct columns of
$C$. This implies that we could use at most $mn - (t+1)$ cells
to hold encoding symbols of degree two. In other words,
$C$ contains at most $2(mn-(t+1)) + t + 1$ occurrences of
information symbols. By the above fact, we must have

$$n_0 m(t+1) \le 2(mn-(t+1)) + t + 1.$$

By substituting $t = n - n_0$ and rearranging the terms, we get

$$n_0 m n - n_0 m(n_0-1) \le 2mn - (n - n_0 + 1).$$

If we move all terms to the right hand side, we get

$$n_0(n_0-1)m - ((n_0-2)m+1)n + (n_0-1) \ge 0.$$
Finally, the above inequality could be rewritten as

$$((n_0-2)m+1)\,n \le (n_0-1)(n_0 m+1).$$

That is,

$$n \le \frac{(n_0-1)(n_0 m+1)}{(n_0-2)m+1} = \frac{n_0-1}{n_0-2}\left(n_0 - \frac{2}{(n_0-2)m+1}\right).$$ □

Based on the inequality in Theorem 7.1, we get the necessary conditions for
$n$ for different $n_0$ in Table IX. The values in Table IX show
that for degree one and two encoding symbols based array
BP-XOR codes, if we want to recover the information symbols
from more than three columns (i.e., $n_0 > 3$) of encoding
symbols, then we could only have one column redundancy
for $n > 4$.

TABLE IX
Necessary conditions for $n$ for different $n_0$

$n_0$            $m$              $n$
3                $[1, 2]$         $\le 4$
3                $[3, \infty)$    $\le 5$
$[4, \infty)$    $[1, \infty)$    $\le n_0 + 1$
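
The entries of Table IX can be reproduced by evaluating the bound of Theorem 7.1 with exact rational arithmetic. The sketch below is an illustration; the function name `max_columns` is an assumption, and the bound is used in the equivalent form $n \le (n_0-1)(n_0 m+1)/((n_0-2)m+1)$, which follows from the last inequality in the proof.

```python
from fractions import Fraction
from math import floor

def max_columns(n0, m):
    """Largest integer n allowed by the necessary condition of Theorem 7.1
    for a code with n0 = n - t > 2 and m rows."""
    if n0 <= 2:
        raise ValueError("Theorem 7.1 assumes n0 > 2")
    bound = Fraction((n0 - 1) * (n0 * m + 1), (n0 - 2) * m + 1)
    return floor(bound)

for n0, m in [(3, 1), (3, 2), (3, 3), (3, 50), (4, 1), (4, 50), (10, 5)]:
    print(n0, m, max_columns(n0, m))
```

For $n_0 = 3$ the value is 4 when $m \le 2$ and 5 when $m \ge 3$; for every $n_0 \ge 4$ tried in the tests it equals $n_0 + 1$, matching the three rows of the table.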



Combining Theorem 7.1 and the values in Table IX, we get the
following results for edge-colored graphs.

Theorem 7.2: For a color set $C$ with $|C| > 5$, if we want
to design an edge-colored graph $G(V, E, C, f)$ (or a network
with more than $|C|$ kinds of homogeneous devices) with
minimum cost, then the edge-colored graph is robust against at
most one color failure (or the failure of one brand of homogeneous
devices).

Proof. Based on the results in Theorems 5.3 and 7.1 and the values
in Table IX, we can have the following conclusion: Given
integers $n_0$, $m$ and $n$, an edge-colored graph $G(V, E, C, f)$
with $|C| = n$, $|V| = n_0 m + 1$, and $|E| = nm - (n - n_0)$,
$G(V, E, C, f)$ is $(n - n_0)$-color connected only $n - n_0 + 1$.
Thus the theorem follows. □



VIII. Conclusion 

Based on the BP (Belief Propagation) decoding process and
the edge-colored graph model [27], we introduced flat
BP-XOR codes and array BP-XOR codes. We have established
the equivalence between edge-colored graphs and degree one
and two based array BP-XOR codes. In particular, we used
results on array BP-XOR codes to get new results on
edge-colored graphs. For array BP-XOR codes with higher degree
encoding symbols, we do not have general results yet. It
would be interesting to have a complete characterization
of the existence and bounds for array BP-XOR codes with
higher degree encoding symbols. These characterizations may
be used to design more efficient LT codes or digital fountain
techniques. We have implemented an online software package
for users to generate array BP-XOR codes with their own
specification and to verify the validity of their array BP-XOR
codes (see [26]).



Acknowledgements

I would like to thank Duan Qi for some discussion on
Hamming code and Theorem 4.2 and thank Prof. Doug Stinson
and Yvo Desmedt for some discussions on edge-colored
graphs, Hamiltonian circuits, and factorization of complete
graphs.